[Spike] sdk/log: Change FilterProcessor to Filterer #5825
Conversation
Codecov Report: All modified and coverable lines are covered by tests ✅

```
@@ Coverage Diff @@
##            main   #5825   +/- ##
=====================================
  Coverage   84.5%   84.5%
=====================================
  Files        272     273     +1
  Lines      22734   22739     +5
=====================================
+ Hits       19224   19231     +7
+ Misses      3167    3165     -2
  Partials     343     343
```
@open-telemetry/go-approvers, PTAL.
I like the idea of having a similar design in the trace provider. But I find this naming, `Filterer`, a little strange, as I am not sure there is a word for that. Maybe `Enabler`? 🤔
```diff
@@ -30,6 +29,7 @@ const (
 type providerConfig struct {
 	resource   *resource.Resource
 	processors []Processor
+	filterers  []Filterer
```
Just like the sampler in `TracerProvider`, could we only have a single instance here? If no one is complaining about the current implementation of the sampler, maybe it is OK to have just one instance.
I think this approach is simpler and more consistent with how processing works.

For reference, samplers can be decorators, e.g. the ParentBased sampler. There is nothing technically preventing a filterer decorator. However, I feel that accepting multiple filterers increases the ease of use.

Initially I wanted to name it
```go
param.SetSeverity(r.Severity())
if !l.Enabled(ctx, param) {
```
Maybe it should be possible to filter based on more parameters than those passed via `Logger.Enabled`, e.g. the log record body, attributes, etc. This could be added later without making any breaking change.
Another piece of feedback received during calls is that this proposal is less flexible and makes declaring some pipelines harder. For example, how would one define two log record processing pipelines with different filters? For such a case the filterer would have to be attached to a processor (not globally for the whole SDK). Chaining filters that are coupled to emitted logs is also awkward, e.g.:

```go
// Both processors also implement the Filterer interface.
etw := &ETWLogProcessor{}
proc := &RateLimitingProcessor{
	Processor: etw,
	Duration:  time.Second,
	Size:      1000,
}
provider := log.NewLoggerProvider(
	log.WithProcessor(proc),
	log.WithFilterer(proc),
	log.WithFilterer(etw),
)
```

It makes decorating processors more bug-prone, not to mention more advanced, nested (tree-structured) processing. Context: open-telemetry/opentelemetry-specification#4407
Fixes #4363
Towards #4208 (uses the Severity Level passed via `Logger.Enabled`)
Towards stabilization of the OpenTelemetry Go Logs API and SDK.

## Use cases

Below are some use cases where the new functionality can be used:

1. Bridge features like `LogLevelEnabled` in log bridge/appender implementations. This is needed for **all** (but one) currently supported log bridges in OTel Go Contrib.
2. Configure a minimum log severity level for a certain log processor.
3. Filter out log and event records when they are inside a span that has been sampled out (the span is valid and has a sampled flag of `false`).
4. **Efficiently** support high-performance logging destinations like [Linux user_events](https://docs.kernel.org/trace/user_events.html) and [ETW (Event Tracing for Windows)](https://learn.microsoft.com/windows/win32/etw/about-event-tracing).
5. Bridge the Logs API to a language-specific logging library (the other way around than usual).

## Changes

Add an `Enabled` opt-in operation to the `LogRecordProcessor`.

I created an OTEP first, which was great for having a lot of discussions and evaluations of different proposals:

- #4290

Most importantly, from #4290 (comment):

> Among the Go SIG we evaluated a few times an alternative providing some new "filter" abstraction which is decoupled from the "processor". However, we faced more issues than benefits going this route (some of this is described here, but there were more issues: open-telemetry/opentelemetry-go#5825 (comment)). With the current opt-in `Processor.Enabled` we have faced fewer issues so far.
>
> We also do not want to replicate all features from the logging libraries. If someone prefers the log4j (or other) filter design, they can always use a bridge and use log4j for filtering. The `Enabled` callback hook is the simplest design (yet very flexible), which makes it easy to implement in the SDKs.

This design is inspired by the two most popular Go structured logging libraries: https://pkg.go.dev/log/slog (standard library) and https://pkg.go.dev/go.uber.org/zap.

> It is worth adding that the Rust design is similar and it also has an `Enabled` hook. See #4363 (comment).

Basically we want to add something like https://docs.rs/log/latest/log/trait.Log.html#tymethod.enabled to the `LogRecordProcessor` and allow users to implement `Enabled` in a way that meets their requirements.

I also want to call out from https://github.com/open-telemetry/opentelemetry-specification/blob/main/oteps/0265-event-vision.md#open-questions:

> How to support routing logs from the Logs API to a language-specific logging library

To support this we would need a log record processor which bridges the Logs API calls to the given logging library. For such a case we would need an `Enabled` hook in `Processor` to efficiently bridge `Logger.Enabled` calls. A filterer design would not satisfy such a use case.

I decided to name the new operation `Enabled` because:

1. this name is already used in logging libraries in many languages: #4439 (comment)
2. it matches the name of the API call (for all trace, metrics, and logs APIs).

I was also considering `OnEnabled` to have the same pattern as for `Emit` and `OnEmit`. However, we already have `ForceFlush` and `Shutdown`, which do not follow this pattern, so I preferred to keep the simple `Enabled` name. For `OnEmit` I could also imagine `OnEmitted` (or `OnEmitting`), which does something after (or just before, like we have `OnEnding` in `SpanProcessor`) `OnEmit` is called on all registered processors. Yet, I cannot imagine something similar for `Enabled`, as calling `Enabled` should not have any side effects. Therefore, I decided to name it `Enabled`.

I want to highlight that a processor cannot assume `Enabled` was called before `OnEmit`, for the following reasons:

1. **Backward compatibility** – Existing processors may already perform filtering without relying on `Enabled`. For example: [Add Advanced Processing to Logs Supplementary Guidelines #4407](#4407).
2. **Self-sufficiency of `OnEmit`** – Since `Enabled` is optional, `OnEmit` should be able to handle filtering independently. A processor filtering events should do so in `OnEmit`, not just in `Enabled`.
3. **Greater flexibility** – Some processors, such as the ETW processor, don't benefit from redundant filtering. ETW already filters out events internally, making an additional check unnecessary.
4. **Performance considerations** – Calling `Enabled` from `OnEmit` introduces overhead, as it requires converting `OnEmit` parameters to match `Enabled`'s expected input.
5. **Avoiding fragile assumptions** – Enforcing constraints that the compiler cannot validate increases the risk of introducing bugs.

This feature is already implemented in OpenTelemetry Go:

- open-telemetry/opentelemetry-go#6317

We have one processor in Contrib which takes advantage of this functionality:

- https://pkg.go.dev/go.opentelemetry.io/contrib/processors/minsev

This feature (however with some differences) is also available in OTel Rust; see #4363 (comment):

> OTel Rust also has this capability. Here's an example where it is leveraged to improve performance by dropping unwanted logs early. https://github.com/open-telemetry/opentelemetry-rust/blob/88cae2cf7d0ff54a042d281a0df20f096d18bf82/opentelemetry-appender-tracing/benches/logs.rs#L78-L85

---------

Co-authored-by: Tyler Yahn <[email protected]>
Co-authored-by: Sam Xie <[email protected]>
Towards open-telemetry/opentelemetry-specification#4207

Based on Proposal B.

This is another idea for implementing filtering in the Logs SDK. I think this design is more in the spirit of the specification and the current SDK design. The initial implementation of `Enabled` was designed when wrapping processors was also needed for implementing log record mutations during processing, as log records were passed by value (and not by pointer). This approach should make adding custom filtering easier for users in most cases (see the updated `ExampleFilterer`). There is a little (mostly CPU) performance overhead caused by looping over filterers in `logger.Emit`, but I do not find it significant.